Abstract

It is known that the Mizuno-Todd-Ye predictor-corrector primal-dual Newton interior-point method generates a duality-gap sequence which converges quadratically to zero, and this is accomplished with an iteration complexity of $O(\sqrt{n}L)$. Very recently the present authors demonstrated that the iteration sequence generated by this method converges, and this convergence is to the analytic center of the solution set. In the current work we show that within a finite number of iterations the Newton corrector step can be replaced with a simplified Newton corrector step and the resulting algorithm maintains $O(\sqrt{n}L)$ iteration complexity, quadratic convergence of the duality-gap sequence to zero, and convergence of the iteration sequence (however not necessarily to the analytic center). The simplified predictor-corrector algorithm requires only one linear solve per iteration in contrast to the two linear solves per iteration required by the original predictor-corrector algorithm.


1  Introduction  and  Preliminaries 

The basic primal-dual interior-point method for linear programming was originally proposed by Kojima, Mizuno, and Yoshise [4] based on earlier work of Megiddo [8]. This method can be viewed as perturbed and damped Newton's method applied to the first-order conditions for a particular standard form linear program. They established linear convergence and an iteration complexity bound of $O(nL)$ for this basic algorithm. Soon after, Mizuno, Todd, and Ye [11] considered a predictor-corrector variant of the Kojima-Mizuno-Yoshise basic algorithm. In their algorithm the predictor step is a damped Newton step and the corrector step is a perturbed (centered) Newton step. Hence one iteration of the predictor-corrector algorithm requires the solution of two linear systems; essentially two Newton steps. Hence when comparing convergence rate results they should technically be considered to be two-step results. Mizuno, Todd, and Ye established linear convergence for their predictor-corrector algorithm and a superior iteration complexity bound of $O(\sqrt{n}L)$.

We now briefly give a chronological account of the development of fast (superlinear) convergence for these primal-dual interior-point methods. We refer to the Kojima-Mizuno-Yoshise method as the basic method, and to the Mizuno-Todd-Ye method as the predictor-corrector method. When we discuss convergence or convergence attributes of one of these methods we are describing the convergence of the duality gap to zero. This interpretation has become standard in this area, even though convergence of the duality-gap sequence does not imply convergence of the iteration sequence. The convergence of the iteration sequence is certainly an important issue in its own right and to some extent has been neglected. For an interesting result concerning the convergence of the iteration sequence generated by the basic method see Tapia, Zhang, and Ye [12]. For a definitive result concerning the convergence of the iteration sequence for the predictor-corrector method see Gonzaga and Tapia [3].

Zhang, Tapia, and Dennis [19] demonstrated that under certain assumptions the algorithmic parameters in the basic method could be chosen so that superlinear convergence was obtained for degenerate problems and quadratic convergence was obtained for nondegenerate problems. However, they did not demonstrate that polynomial complexity would be retained. Zhang and Tapia [18] demonstrated that the algorithmic parameters in the basic algorithm could be chosen so that the polynomial complexity bound was maintained and superlinear convergence was obtained for degenerate problems, while quadratic convergence was obtained for nondegenerate problems. Ye, Tapia and Zhang [16] demonstrated that the predictor-corrector algorithm was superlinearly convergent for degenerate problems and quadratically convergent for nondegenerate problems while maintaining its $O(\sqrt{n}L)$ iteration complexity. McShane [6] independently obtained a similar result. Up to this point all superlinear convergence results assumed that the iteration sequence converged. Ye, Güler, Tapia, and Zhang [15], and independently Mehrotra [9], based on Ye, Tapia, and Zhang [16], demonstrated the surprising result that neither the nondegeneracy assumption nor the assumption of iteration sequence convergence was needed for the quadratic convergence of the predictor-corrector algorithm.

In this paper we add to the literature on the predictor-corrector algorithm by demonstrating that its quadratic convergence and $O(\sqrt{n}L)$ complexity are retained if one replaces the Newton corrector step with a simplified Newton step, i.e., the Jacobian from the Newton predictor step is used also in the computation of the corrector step. Hence the corrector step only requires a back-solve, and the complete iteration only requires the solution of one linear system. Actually the Newton corrector step cannot be replaced with a simplified Newton corrector step at the beginning of the iterative process, but only after a particular criterion is satisfied. We demonstrate that this criterion will be satisfied within a finite number of iterations. We also show that the simplified algorithm generates an iteration sequence which is convergent, but not necessarily to the analytic center.

Recently Ye [14] was able to show that a variant of the Mizuno-Todd-Ye predictor-corrector algorithm could be given that eventually did not require the corrector step. He demonstrated that this variant algorithm gave subquadratic convergence (the Q-rate is two, but the $Q_2$-factor may be unbounded). Hence Ye attains a convergence rate of two with an algorithm which (eventually) only requires one linear solve per iteration. Our simplified Mizuno-Todd-Ye algorithm gives Q-quadratic convergence but requires the solution of one linear system and an additional back-solve per iteration. It should be clear that any convergence rate analysis based on total number of arithmetic operations per iteration will favor the Ye variant. It should also be clear that the numerical efficiency of an algorithm is determined by the effective number of iterations needed for numerical convergence and not by convergence rate alone.

The paper is organized as follows. In the remainder of this section we introduce our notation and several fundamental background notions. In Section 2 we discuss the primal-dual Newton step and the primal-dual simplified Newton step and derive several properties concerning these two steps. Some results on scaled projections from Gonzaga and Tapia will be collected in Section 3. These results will be used in Section 5. The Mizuno-Todd-Ye predictor-corrector algorithm is presented in Section 4. Section 5 begins with the presentation of the simplified predictor-corrector algorithm and then turns to establishing our convergence theory for the simplified predictor-corrector algorithm. In Section 6 we make some observations that imply that quadratic convergence is optimal for both the predictor-corrector method and its simplified variant. We indicate that cubic convergence might be obtained by appropriately modifying the corrector step.

Given a vector $x$, $d$, $\phi$, the corresponding upper case symbol denotes (as usual) the diagonal matrix $X$, $D$, $\Phi$ defined by the vector.

We denote component-wise operations on vectors by the usual notations for real numbers. Thus, given two vectors $u, v$ of the same dimension, $uv$, $u/v$, etc. denote the vectors with components $u_i v_i$, $u_i/v_i$, etc. This notation is consistent as long as component-wise operations are given precedence over matrix operations. Note that $uv = Uv$ and if $A$ is a matrix, then $Auv = AUv$, but in general $Auv \neq (Au)v$.
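The precedence rule can be checked on a small instance. The following is a minimal Python sketch; the helper names (`diag`, `matvec`, `matmul`, `cw`) and the numerical data are assumptions made for illustration, and a square $A$ is used so that $(Au)v$ is defined.

```python
# Component-wise products versus matrix products, using plain lists.

def diag(u):
    """Diagonal matrix U built from the vector u."""
    n = len(u)
    return [[u[i] if i == j else 0.0 for j in range(n)] for i in range(n)]

def matvec(M, v):
    """Matrix-vector product M v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def matmul(M, N):
    """Matrix-matrix product M N."""
    cols = list(zip(*N))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in M]

def cw(u, v):
    """Component-wise product uv."""
    return [a * b for a, b in zip(u, v)]

A = [[1.0, 2.0], [3.0, 4.0]]
u = [1.0, 2.0]
v = [3.0, 4.0]

uv = cw(u, v)                          # component-wise product
Uv = matvec(diag(u), v)                # same vector, written as U v
A_uv = matvec(A, uv)                   # A(uv): component-wise product first
AU_v = matvec(matmul(A, diag(u)), v)   # (AU)v
Au_v = cw(matvec(A, u), v)             # (Au)v: a different vector in general

print(uv, Uv, A_uv, AU_v, Au_v)
```

For this data $A(uv)$ and $(AU)v$ agree, while $(Au)v$ differs, which is the reason component-wise operations are given precedence.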

We frequently use the $O(\cdot)$ and $\Omega(\cdot)$ notation to express a relationship between functions. Our most common usage will be associated with a sequence $\{x^k\}$ of vectors and a sequence $\{\mu_k\}$ of positive real numbers. In this case $x = O(\mu)$, or $x^k = O(\mu_k)$, means that there is a constant $K$ (dependent on problem data) such that for every $k \in \mathbb{N}$, $\|x^k\| \le K\mu_k$. Similarly, $x = \Omega(\mu)$, or $x^k = \Omega(\mu_k)$, means that there is $\epsilon > 0$ such that for every $k \in \mathbb{N}$, $\|x^k\| \ge \epsilon\mu_k$.

The primal and dual linear programming problems are:

$$\text{(LP)} \qquad \begin{array}{ll} \text{minimize} & c^T x \\ \text{subject to} & Ax = b \\ & x \ge 0, \end{array}$$

and

$$\text{(LD)} \qquad \begin{array}{ll} \text{maximize} & b^T y \\ \text{subject to} & A^T y + s = c \\ & s \ge 0, \end{array}$$

where $c \in \mathbb{R}^n$, $b \in \mathbb{R}^m$, $A \in \mathbb{R}^{m \times n}$. We assume that both problems have optimal solutions, and that the sets of optimal solutions are bounded. This is equivalent to the requirement that both feasible sets contain points satisfying all inequality constraints strictly.

Given any feasible primal-dual pair $(\tilde{x}, \tilde{s})$, the problems can be rewritten as

$$\text{(LP)} \qquad \begin{array}{ll} \text{minimize} & \tilde{s}^T x \\ \text{subject to} & Ax = b \\ & x \ge 0 \end{array}$$

and

$$\text{(LD)} \qquad \begin{array}{ll} \text{minimize} & \tilde{x}^T s \\ \text{subject to} & Bs = Bc \\ & s \ge 0, \end{array}$$

where $B^T$ is a matrix whose columns span the null space of $A$. Popular choices for $B^T$ are an orthonormal basis for the null space of $A$ and $B = P_A$, the projection matrix into the null space of $A$.

The feasible sets for (LP) and (LD) will be denoted respectively by $\mathcal{P}$ and $\mathcal{D}$. Their relative interiors will be respectively $\mathcal{P}^0$ and $\mathcal{D}^0$.

The set of optimal solutions for the primal-dual pair of problems constitutes a face $F = F_P \times F_D$ of the polyhedron of feasible solutions, where $F_P$ and $F_D$ are respectively the primal and dual optimal faces. By hypothesis, this face is a compact set. It is well known that this face is characterized by a partition $\{B, N\}$ of the set of indices $\{1, \ldots, n\}$ such that $F_P = \{x \in \mathcal{P} \mid x_N = 0\}$ and $F_D = \{s \in \mathcal{D} \mid s_B = 0\}$. In the relative interior of the face $F$, $x_B > 0$ and $s_N > 0$.

We  study  algorithms  that  converge  to  the  optimal  face.  Our  main  concern 
is  with  the  behaviour  of  the  iterates  as  they  approach  the  optimal  face.  We 
want  this  to  happen  in  such  a  manner  that  all  limit  points  are  in  the  relative 
interior  of  the  optimal  face.  We  shall  see  later  on  how  this  condition  can  be 
enforced by requiring some adherence to the central path. For details on the central path see Gonzaga [2].

Given $\mu > 0$, $\mu \in \mathbb{R}$, the pair $(x, s)$ of feasible primal and dual solutions is the central point $(x(\mu), s(\mu))$ associated with $\mu$ if

$$xs = \mu e,$$

where $e$ stands for the vector of all ones, with dimension given by the context. The central path is the curve in $\mathbb{R}^{2n}$ parametrized by the positive real $\mu$, i.e.,

$$\mu \mapsto (x(\mu), s(\mu)).$$

Thus $(x, s)$ is a central point if and only if

$$\begin{aligned} xs &= \mu e \\ Ax &= b \\ Bs &= Bc \\ x, s &> 0, \end{aligned} \tag{1}$$

where the columns of $B^T$ span the null space of $A$.

The first-order or Karush-Kuhn-Tucker (KKT) conditions for problem (LP) (or (LD)) are

$$\begin{aligned} xs &= 0 \\ Ax &= b \\ A^T y + s &= c \\ x, s &\ge 0. \end{aligned}$$

The perturbed KKT conditions, for perturbation parameter $\mu > 0$, are

$$\begin{aligned} xs &= \mu e \\ Ax &= b \\ A^T y + s &= c \\ x, s &> 0. \end{aligned} \tag{2}$$

Observe that the perturbed KKT conditions are merely the defining relations for the central path and (2) can equivalently be written as (1). Essentially all primal-dual interior-point methods for problem (LP) consist of some variant of the damped Newton method applied to the perturbed KKT conditions (1) or (2).
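As a concrete illustration of the central path, consider the toy problem: minimize $x_1$ subject to $x_1 + x_2 = 1$, $x \ge 0$. This problem, the function name `central_point`, and the closed form below are assumptions made for the sketch and do not come from the paper. The dual slacks are $s = (1 - y, -y)$, and writing $t = -y > 0$ the conditions $xs = \mu e$, $x_1 + x_2 = 1$ reduce to $t^2 + (1 - 2\mu)t - \mu = 0$.

```python
import math

def central_point(mu):
    """Central point (x(mu), s(mu)) of: min x1  s.t.  x1 + x2 = 1, x >= 0."""
    # Positive root of t^2 + (1 - 2 mu) t - mu = 0, where t = -y.
    t = ((2.0 * mu - 1.0) + math.sqrt(1.0 + 4.0 * mu * mu)) / 2.0
    s = [1.0 + t, t]                 # s = c - A^T y with c = (1, 0), A = [1 1]
    x = [mu / s[0], mu / s[1]]       # enforce xs = mu e component-wise
    return x, s

for mu in (1.0, 0.1, 0.001):
    x, s = central_point(mu)
    print(mu, x, s)
```

As $\mu$ decreases, the printed $x(\mu)$ moves toward the optimal solution $(0, 1)$ of the toy problem while remaining strictly positive.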

2  The  Newton  and  Simplified  Newton  Steps 

When  dealing  with  an  iterative  procedure  we  will  use  the  superscript  0  to 
denote  the  previous  iterate,  no  superscript  to  denote  the  current  iterate,  a 
superscript  of  +  to  denote  the  subsequent  iterate.  In  two-step  algorithms 
like  the  Mizuno-Todd-Ye  algorithm  described  in  Section  4  this  notation  will 
apply  to  the  current  iterate,  the  intermediate  iterate,  and  the  final  iterate. 

Suppose that $(x^0, s^0)$ and $(x, s)$ have been obtained from a form of Newton's method and are both feasible pairs. The Newton step (or correction) for (1) at $(x, s)$ is given by $(u, v)$, the solution of

$$\begin{aligned} xv + su &= -xs + \mu e \\ Au &= 0 \\ Bv &= 0, \end{aligned} \tag{3}$$

and the simplified Newton step for (1) at $(x, s)$ is given by $(u, v)$, the solution of

$$\begin{aligned} x^0 v + s^0 u &= -xs + \mu e \\ Au &= 0 \\ Bv &= 0. \end{aligned} \tag{4}$$

It should be clear that the difference between (3) and (4) is that (3) uses the Jacobian of (1) at $(x, s)$ and (4) uses the Jacobian of (1) at $(x^0, s^0)$.
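The difference between the two steps can be seen on a small instance. The sketch below is illustrative only: it assumes the toy constraint matrix $A = [1 \;\; 1]$, for which the null space of $A$ is spanned by $(1, -1)$ and the range of $A^T$ by $(1, 1)$, so each system reduces to a $2 \times 2$ solve. The function name `solve_step` and the numerical data are hypothetical.

```python
def solve_step(xt, st, rhs):
    """Solve xt*v + st*u = rhs with u = t*(1,-1) in N(A), v = r*(1,1) in R(A^T).

    xt, st are the coefficient vectors (the point where the Jacobian is taken).
    """
    # Unknowns (r, t):  xt[0]*r + st[0]*t = rhs[0],  xt[1]*r - st[1]*t = rhs[1].
    det = -(xt[0] * st[1] + st[0] * xt[1])
    r = (-rhs[0] * st[1] - st[0] * rhs[1]) / det
    t = (xt[0] * rhs[1] - xt[1] * rhs[0]) / det
    return [t, -t], [r, r]

x0, s0 = [0.4, 0.6], [1.2, 0.2]    # previous feasible pair
x, s = [0.3, 0.7], [1.5, 0.5]      # current feasible pair
mu = 0.2
rhs = [-x[i] * s[i] + mu for i in range(2)]

u_newton, v_newton = solve_step(x, s, rhs)     # system (3): Jacobian at (x, s)
u_simpl, v_simpl = solve_step(x0, s0, rhs)     # system (4): Jacobian at (x0, s0)
print(u_newton, v_newton)
print(u_simpl, v_simpl)
```

Both calls share the same right-hand side; only the coefficient vectors change, which is why the simplified step can reuse a factorization computed at $(x^0, s^0)$.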

We introduce some additional notation that will be used throughout the paper. Given a pair $(x, s)$, we define

$$\begin{aligned} \mu(x, s) &= x^T s / n \\ w(x, s) &= xs / \mu(x, s) \\ \delta(x, s) &= \|w(x, s) - e\| \\ \phi(x, s) &= \left(\sqrt{w(x, s)}\right)^{-1}. \end{aligned} \tag{5}$$

When no confusion can arise, we drop the reference to the variables, and continue to use other symbols in a consistent manner. For instance, given a pair $(\tilde{x}, \tilde{s})$, the parameters above will be denoted simply $\tilde{\mu}$, $\tilde{w}$ and $\tilde{\phi}$.

Given a pair $(x, s)$, $\mu(x, s)$ is the penalty parameter associated to $(x, s)$, in the following sense: if $(x, s)$ is a central point, then $xs = \mu e$; otherwise $\mu$ is the penalty parameter associated with the central point that is nearest the pair $(x, s)$, in terms of a certain proximity measure. The vector $w$ consists of logarithmic barrier weights associated with $(x, s)$. It characterizes the weighted primal-dual affine scaling trajectory through $(x, s)$, as studied by Monteiro and Adler [11]. The scalar $\delta$ is a measure of proximity from $(x, s)$ to the central point $(x(\mu), s(\mu))$. The definition of $\phi$ was made merely for convenience; it will simplify expressions below.
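The quantities in (5) are straightforward to compute. The following Python sketch evaluates them for a given pair; the function name `measures` and the sample data are assumptions made for illustration.

```python
import math

def measures(x, s):
    """Return (mu, w, delta, phi) from definition (5) for a pair (x, s) > 0."""
    n = len(x)
    mu = sum(xi * si for xi, si in zip(x, s)) / n          # mu(x,s) = x^T s / n
    w = [xi * si / mu for xi, si in zip(x, s)]             # w = xs / mu
    delta = math.sqrt(sum((wi - 1.0) ** 2 for wi in w))    # delta = ||w - e||
    phi = [1.0 / math.sqrt(wi) for wi in w]                # phi = w^(-1/2)
    return mu, w, delta, phi

mu, w, delta, phi = measures([0.3, 0.7], [1.5, 0.5])
print(mu, w, delta, phi)

# A pair with xs proportional to e is a central point, so delta vanishes.
mu_c, w_c, delta_c, phi_c = measures([1.0, 2.0], [2.0, 1.0])
print(delta_c)
```

Note that the components of $w$ always average to one, so $\delta$ measures only the spread of the products $x_i s_i$ around their mean.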

At this point we are interested in obtaining usable closed form solutions for the simplified Newton step and the Newton step. We also derive an interesting property of the simplified Newton step. In what follows it is important not to confuse $\mu$ in (3) and (4) with $\mu(x, s)$ given in (5), because they are not necessarily the same. Hence $\mu$ denotes the $\mu$ in (3) and (4) and $\mu(x, s)$ means the $\mu(x, s)$ given in (5). Since no confusion will arise in the case of $\mu^0$, we use $\mu^0$ to denote $\mu(x^0, s^0)$.

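Proposition 2.1 The simplified Newton step $(u, v)$ given by (4) can be written

$$\begin{aligned} u &= x^0 \phi^0 P_{AD^0}\, \phi^0 \left( -\frac{xs}{\mu^0} + \frac{\mu}{\mu^0} e \right) \\ v &= s^0 \phi^0 \tilde{P}_{AD^0}\, \phi^0 \left( -\frac{xs}{\mu^0} + \frac{\mu}{\mu^0} e \right), \end{aligned} \tag{6}$$

where $\tilde{P} = I - P$.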

Proof. Assume that instead of (4), the simplified Newton equations are written as

$$x^0 v + s^0 u = -xs + \mu e, \quad u \in \mathcal{N}(A), \quad v \in \mathcal{R}(A^T), \tag{7}$$

where as usual $\mathcal{N}$ denotes null space and $\mathcal{R}$ denotes range space. The solution is obtained by associating a scaling vector

$$d = \sqrt{x/s}$$

to each pair $(x, s)$. Using the definitions in (5) and dropping argument references when no confusion will arise,

$$d = \frac{1}{\sqrt{\mu}}\, x\phi, \qquad d^{-1} = \frac{1}{\sqrt{\mu}}\, s\phi. \tag{8}$$

The solution of (7) is obtained by scaling the problems by $\bar{x} = (d^0)^{-1} x$, $\bar{s} = d^0 s$ (and correspondingly $\bar{u} = (d^0)^{-1} u$, $\bar{v} = d^0 v$):

$$\bar{x}^0 \bar{v} + \bar{s}^0 \bar{u} = -\bar{x}\bar{s} + \mu e, \quad \bar{u} \in \mathcal{N}(AD^0), \quad \bar{v} \in \mathcal{R}(D^0 A^T).$$

The choice of this scaling becomes clear when we notice that by direct substitution,

$$\bar{x}^0 = \bar{s}^0 = \sqrt{x^0 s^0} = \frac{\sqrt{\mu^0}}{\phi^0}. \tag{9}$$

Dividing the equation by $\bar{s}^0$ and using the definitions of scaled variables,

$$\bar{u} + \bar{v} = -\frac{\bar{x}\bar{s}}{\bar{s}^0} + \mu (\bar{x}^0)^{-1} = \frac{\phi^0}{\sqrt{\mu^0}} (-xs + \mu e).$$

Hence $\bar{u}$ and $\bar{v}$ are the components of the right-hand side in the complementary subspaces, the null space and row space of $AD^0$, and are given by

$$\begin{aligned} \bar{u} &= P_{AD^0} \frac{\phi^0}{\sqrt{\mu^0}} (-xs + \mu e) \\ \bar{v} &= \tilde{P}_{AD^0} \frac{\phi^0}{\sqrt{\mu^0}} (-xs + \mu e), \end{aligned} \tag{10}$$

where $\tilde{P}_{AD^0} = I - P_{AD^0}$. Finally, $u = d^0 \bar{u}$ and $v = (d^0)^{-1} \bar{v}$.

A convenient formulation is obtained by substituting $d^0 = \frac{1}{\sqrt{\mu^0}} x^0 \phi^0$ and $(d^0)^{-1} = \frac{1}{\sqrt{\mu^0}} s^0 \phi^0$, and this leads to (6). ■

The simplified Newton step and the Newton step satisfy an interesting property. This property will turn out to be fundamental to the analysis presented in Section 5. Hence we derive this property in a form which covers both the simplified Newton step and the Newton step.

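Proposition 2.2 Let $(x, s)$ and $(\tilde{x}, \tilde{s})$ be feasible pairs. Consider $x^+ = x + u$ and $s^+ = s + v$ where $(u, v)$ satisfies

$$\tilde{x} v + \tilde{s} u = -(1 - \gamma) xs + \mu e, \quad u \in \mathcal{N}(A), \quad v \in \mathcal{R}(A^T).$$

Then

$$\mu(x^+, s^+) = \gamma \mu(x, s) + \mu. \tag{11}$$

Proof. Left multiplying by $e^T$, we obtain

$$\tilde{x}^T v + \tilde{s}^T u = -(1 - \gamma) x^T s + n\mu.$$

From the definition,

$$x^{+T} s^+ = x^T s + x^T v + s^T u,$$

since $u^T v = 0$. But $x^T v = \tilde{x}^T v$, because $x - \tilde{x} \in \mathcal{N}(A)$ and $v \in \mathcal{R}(A^T)$, and similarly $s^T u = \tilde{s}^T u$. Substituting in the expressions above and dividing by $n$ we immediately obtain (11). ■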
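Identity (11) can be verified numerically on a small instance. The sketch below is illustrative only: it assumes the toy constraint matrix $A = [1 \;\; 1]$ (null space spanned by $(1, -1)$, range of $A^T$ spanned by $(1, 1)$), and the function name `solve_step` and all numerical values are hypothetical.

```python
def solve_step(xt, st, rhs):
    """Solve xt*v + st*u = rhs with u = t*(1,-1) in N(A), v = r*(1,1) in R(A^T)."""
    det = -(xt[0] * st[1] + st[0] * xt[1])
    r = (-rhs[0] * st[1] - st[0] * rhs[1]) / det
    t = (xt[0] * rhs[1] - xt[1] * rhs[0]) / det
    return [t, -t], [r, r]

def mu_of(x, s):
    """mu(x, s) = x^T s / n with n = 2."""
    return (x[0] * s[0] + x[1] * s[1]) / 2.0

x, s = [0.3, 0.7], [1.5, 0.5]        # feasible pair (x, s)
xt, st = [0.4, 0.6], [1.2, 0.2]      # feasible pair (x~, s~) supplying the coefficients
gamma, mu = 0.3, 0.05

rhs = [-(1.0 - gamma) * x[i] * s[i] + mu for i in range(2)]
u, v = solve_step(xt, st, rhs)
x_plus = [x[i] + u[i] for i in range(2)]
s_plus = [s[i] + v[i] for i in range(2)]

lhs = mu_of(x_plus, s_plus)
rhs_11 = gamma * mu_of(x, s) + mu    # right-hand side of (11)
print(lhs, rhs_11)
```

The two printed values agree up to rounding even though the coefficients come from a different feasible pair, which is exactly the situation of the simplified Newton step.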


3  Scaled  Projections 

In this section we collect some results on scaled projections from Gonzaga and Tapia [3]. These results are extensions of results published by Megiddo and Shub [8]. We use $\mathbb{R}_+$ to denote the nonnegative reals, and $\mathbb{R}_{++}$ to denote the positive reals.

Consider the primal feasible set for (LP),

$$\mathcal{P} = \{x \in \mathbb{R}^n \mid Ax = b,\ x \ge 0\}$$

and the map $h$ defined for $d \in \mathbb{R}^n_+$, $d \neq 0$ and $p \in \mathbb{R}^n$ by

$$(d, p) \mapsto h(d, p) = P_{AD}\, p, \tag{12}$$

where $P_{AD}$ represents the projection matrix into the null space of $AD$.

We study the behaviour of this map when $d > 0$, $d \to \bar{d}$ and $p \to \bar{p}$, where $\bar{d} \ge 0$, $\bar{d} \neq 0$, and $\bar{p} \in \mathbb{R}^n$.

Given $\bar{d}$, we define the index sets $B = \{i = 1, \ldots, n \mid \bar{d}_i > 0\}$ and $N = \{i = 1, \ldots, n \mid \bar{d}_i = 0\}$. The variables with indices in $B$ are called the "large variables," and the others are called the "small variables." It is difficult to describe the behaviour of the small variables $h_N(d, p)$ of the scaled projection defined above. The theory of Megiddo and Shub concerns the large variables $h_B(d, p)$. We shall describe this theory conveniently extended to fit our needs. The following proposition is Lemma 3.2 of Gonzaga and Tapia [3]. We refer the reader to that paper for the proof.

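Proposition 3.1 Let $h(d, p)$ be given by (12). Consider $(\bar{d}, \bar{p}) \in \mathbb{R}^n_+ \times \mathbb{R}^n$, $\bar{d} \neq 0$, and $(d^k, p^k) \in \mathbb{R}^n_{++} \times \mathbb{R}^n$ such that $(d^k, p^k) \to (\bar{d}, \bar{p})$. Then

(i) $h_B(d^k, p^k) \to h_B(\bar{d}, \bar{p}) = P_{A_B \bar{D}_B}\, \bar{p}_B$.

(ii) If $\bar{p}_N = 0$, then $h_N(d^k, p^k) \to 0$.

Consider compact sets $\Gamma \subset \mathbb{R}^n$ and $\Delta \subset \mathbb{R}^n_+$, such that for any $d \in \Delta$, $d_B > 0$ and $d_N = 0$, where $\{B, N\}$ is a partition of $\{1, \ldots, n\}$. We now extend the proposition above for the case of sequences $\{d^k\}$ in $\mathbb{R}^n_{++}$ and $\{p^k\}$ in $\mathbb{R}^n$ such that $d^k \to \Delta$ and $p^k \to \Gamma$ *.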

Proposition 3.2 For the situation described above we have the following:

(i) If $d^k \to \Delta$ and $p^k \to \Gamma$, then

$$h_B(d^k, p^k) - P_{A_B D^k_B}\, p^k_B \to 0.$$

(ii) If $d^k \to \bar{d} \in \Delta$ and $p^k \to \bar{p} \in \Gamma$, then

$$h_B(d^k, p^k) - P_{A_B \bar{D}_B}\, \bar{p}_B \to 0.$$

Proof. Implication (ii) follows from (i), since for convergent sequences $P_{A_B D^k_B}\, p^k_B \to P_{A_B \bar{D}_B}\, \bar{p}_B$.

To prove (i), assume by contradiction that there exist $\epsilon > 0$ and sequences $\{d^k\}$ in $\mathbb{R}^n_{++}$ and $\{p^k\}$ in $\mathbb{R}^n$ such that for $k = 1, 2, \ldots$

$$\|h_B(d^k, p^k) - P_{A_B D^k_B}\, p^k_B\| \ge \epsilon. \tag{13}$$

*A sequence $\{z^k\}$ converges to a set $Z$ if $d(z^k, Z) \to 0$, where $d(z^k, Z) = \inf_{z \in Z} \|z^k - z\|$.

Since the sequences $\{d^k\}$ and $\{p^k\}$ converge to compact sets they must be bounded. Hence they have accumulation points $\bar{d}$, $\bar{p}$, such that for some $\mathcal{K} \subset \mathbb{N}$, $d^k \xrightarrow{\mathcal{K}} \bar{d}$ and $p^k \xrightarrow{\mathcal{K}} \bar{p}$. From the fact that $d^k$ converges to $\Delta$ and $p^k$ converges to $\Gamma$ and the compactness of $\Delta$ and $\Gamma$, $\bar{d} \in \Delta$ and $\bar{p} \in \Gamma$. From Proposition 3.1,

$$h_B(d^k, p^k) \xrightarrow{\mathcal{K}} P_{A_B \bar{D}_B}\, \bar{p}_B,$$

and since $\bar{D}_B > 0$,

$$P_{A_B D^k_B}\, p^k_B \xrightarrow{\mathcal{K}} P_{A_B \bar{D}_B}\, \bar{p}_B.$$

Subtracting these last expressions we see that

$$h_B(d^k, p^k) - P_{A_B D^k_B}\, p^k_B \xrightarrow{\mathcal{K}} 0,$$

contradicting (13) and completing the proof. ■

Now we present two facts related to projections and slightly shifted scalings.

Proposition 3.3 Let $q \in \mathbb{R}^n$ be such that $\|q - e\|_\infty \le \alpha$, $\alpha \in (0, 0.25)$, and consider the projections $h = P_A p$, $\tilde{h} = q P_{AQ}\, q p$. Then $\|h - \tilde{h}\| \le 3\alpha \|h\|$.

Proof. See [3]. ■

Given a vector $x \in \mathbb{R}^n_{++}$, the following map defines a norm:

$$h \in \mathbb{R}^n \mapsto \|h\|_x = \|x^{-1} h\|.$$

This is the Euclidean norm of the vector corresponding to $h$ after a scaling $\bar{h} = x^{-1} h$. This norm is widely used in interior-point methods.

The following result shows that all scaled norms for $x$ in a compact set in the interior of the positive orthant are uniformly equivalent.

Proposition 3.4 Let $\Delta \subset \mathbb{R}^n_{++}$ be a compact set. Then there is a number $\Gamma > 0$ such that for any $h \in \mathbb{R}^n$, $x \in \Delta$,

$$\frac{1}{\Gamma} \|h\| \le \|h\|_x \le \Gamma \|h\|.$$

Proof. By definition, given $x \in \Delta$, $\|h\|_x = \|x^{-1} h\|$. We immediately obtain

$$\min_{i=1,\ldots,n} x_i^{-1}\, \|h\| \le \|h\|_x \le \max_{i=1,\ldots,n} x_i^{-1}\, \|h\|.$$

Since $x_i$, $i = 1, \ldots, n$ are bounded and bounded away from zero for $x \in \Delta$, the scalar $\Gamma$ must exist, completing the proof. ■

4  The Mizuno-Todd-Ye Algorithm

The Mizuno-Todd-Ye (MTY) algorithm is a path-following predictor-corrector algorithm. All activity is restricted to a region near the central path, i.e., all points $(x, s)$ generated by the algorithm satisfy

$$\delta(x, s) = \|w(x, s) - e\| = \left\| \frac{xs}{\mu(x, s)} - e \right\| \le \alpha,$$

where $\alpha \in (0, 0.5)$.

We shall describe a typical iteration of the algorithm and list its properties. Complete proofs can be found in Mizuno, Todd, and Ye [10].

Given $\alpha = 0.1$*, a typical iteration begins with feasible $(x^0, s^0)$ such that $\delta(x^0, s^0) = \|w^0 - e\| \le \alpha^2/\sqrt{2}$.

*The original paper uses $\alpha = 0.5$. We shall use a convenient value of 0.1, since this simplifies some formulas ahead.

Predictor step: Given $(x^0, s^0)$ compute the (affine-scaling) step $(u^0, v^0)$ and let $x = x^0 + u^0$, $s = s^0 + v^0$, where $(u^0, v^0)$ is defined by

$$x^0 v^0 + s^0 u^0 = -(1 - \gamma) x^0 s^0, \quad u^0 \in \mathcal{N}(A), \quad v^0 \in \mathcal{R}(A^T),$$

with $\gamma \in [0, 1)$ such that $\delta(x, s) = \|w(x, s) - e\| \le \alpha$. (The specific choice of $\gamma$ will be discussed below.)

Corrector step: Given $(x, s)$ compute the (centering) step $(u, v)$ and let $x^+ = x + u$, $s^+ = s + v$, where $(u, v)$ is defined by

$$xv + su = -xs + \mu e, \quad u \in \mathcal{N}(A), \quad v \in \mathcal{R}(A^T),$$

with $\mu = \mu(x, s)$.

Observe that our $\gamma$ in the predictor step is effectively a steplength parameter. To see this let us denote the predictor step by $(u^0(\gamma), v^0(\gamma))$ and let $\theta = 1 - \gamma$. Then

$$\theta\,(u^0(0), v^0(0)) = (u^0(\gamma), v^0(\gamma))$$

and

$$(x, s) = (x^0, s^0) + \theta\,(u^0(0), v^0(0)),$$

which is the usual way of writing the MTY predictor step. The usual choice for $\theta$ is $\theta_k$, the largest $\theta \in (0, 1]$ such that $\delta(x(\theta), s(\theta)) \le \alpha$ for all $0 \le \theta \le \theta_k$. For further detail see, for example, Section 2 of Ye, Güler, Tapia, and Zhang [15]. Hence our choice for $\gamma$ in the predictor step is $\gamma = 1 - \theta_k$, and can be viewed as the smallest $\gamma \in [0, 1)$ in the sense just described.
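One full predictor-corrector iteration can be sketched on a toy problem. Everything in the block below is an assumption made for illustration: the problem (minimize $x_1$ subject to $x_1 + x_2 = 1$, $x \ge 0$), the helper names, the closed-form central starting point, and the use of bisection to locate the steplength. Bisection is a reasonable stand-in here because the iteration starts on the central path, where the proximity measure grows monotonically along the affine-scaling direction; in general it only finds some acceptable steplength.

```python
import math

# Toy LP: A = [1 1], so N(A) = span{(1,-1)} and R(A^T) = span{(1,1)}.

def central_point(mu):
    """Closed-form central point of the toy LP (used as a starting pair)."""
    t = ((2.0 * mu - 1.0) + math.sqrt(1.0 + 4.0 * mu * mu)) / 2.0
    s = [1.0 + t, t]
    return [mu / s[0], mu / s[1]], s

def solve_step(xt, st, rhs):
    """Solve xt*v + st*u = rhs with u = t*(1,-1), v = r*(1,1)."""
    det = -(xt[0] * st[1] + st[0] * xt[1])
    r = (-rhs[0] * st[1] - st[0] * rhs[1]) / det
    t = (xt[0] * rhs[1] - xt[1] * rhs[0]) / det
    return [t, -t], [r, r]

def mu_of(x, s):
    return (x[0] * s[0] + x[1] * s[1]) / 2.0

def delta(x, s):
    m = mu_of(x, s)
    return math.hypot(x[0] * s[0] / m - 1.0, x[1] * s[1] / m - 1.0)

def mty_iteration(x0, s0, alpha=0.1):
    # Predictor: full affine-scaling direction (gamma = 0), damped by theta.
    u0, v0 = solve_step(x0, s0, [-x0[0] * s0[0], -x0[1] * s0[1]])

    def trial(theta):
        return ([x0[i] + theta * u0[i] for i in range(2)],
                [s0[i] + theta * v0[i] for i in range(2)])

    def acceptable(theta):
        x, s = trial(theta)
        return min(x + s) > 0 and delta(x, s) <= alpha

    lo, hi = 0.0, 1.0
    for _ in range(60):                 # bisection on the steplength
        mid = (lo + hi) / 2.0
        if acceptable(mid):
            lo = mid
        else:
            hi = mid
    x, s = trial(lo)
    gamma = 1.0 - lo

    # Corrector: Newton centering step toward mu = mu(x, s).
    mu = mu_of(x, s)
    u, v = solve_step(x, s, [mu - x[0] * s[0], mu - x[1] * s[1]])
    x_plus = [x[i] + u[i] for i in range(2)]
    s_plus = [s[i] + v[i] for i in range(2)]
    return (x, s), (x_plus, s_plus), gamma

x0, s0 = central_point(0.1)
(x, s), (x_plus, s_plus), gamma = mty_iteration(x0, s0)
print(gamma, delta(x, s), delta(x_plus, s_plus), mu_of(x_plus, s_plus))
```

The predictor reduces the duality gap by the factor $\gamma$ and moves away from the central path; the corrector leaves the gap unchanged and restores proximity.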

Mizuno, Todd, and Ye [10] prove that the algorithm is well defined, in the sense that the centering step produces $(x^+, s^+)$ such that $\delta(x^+, s^+) \le \alpha^2/\sqrt{2}$. Ye, Güler, Tapia, and Zhang [15] (and independently Mehrotra [9]) prove that the duality gap (or equivalently the parameter $\mu$) converges to zero Q-quadratically, i.e.,

$$\mu^+ = O((\mu^0)^2).$$

Using Proposition 2.2 with $(\tilde{x}, \tilde{s}) = (x^0, s^0)$, $\gamma = \gamma$, and $\mu = 0$, we see that for the predictor step

$$\mu(x, s) = \gamma \mu(x^0, s^0).$$

Using Proposition 2.2 with $(\tilde{x}, \tilde{s}) = (x, s)$, $\gamma = 0$, and $\mu = \gamma \mu(x^0, s^0)$, we see that for the corrector step

$$\mu(x^+, s^+) = \gamma \mu(x^0, s^0).$$

So, on one hand we have $\mu^+ = O((\mu^0)^2)$ and on the other hand we have $\mu^+ = \gamma \mu^0$. It follows that

$$\gamma = O(\mu^0).$$

Bounds on the quantities appearing in the algorithm are given in the propositions below. Let $\{B, N\}$ be the optimal partition for the linear programming problem, i.e., the index partition associated to the optimal face. It is well known (see Adler and Monteiro [1]) that the central path ends at the analytic center of the optimal face, and that the pairs $(x, s)$ such that $\|w(x, s) - e\| \le \alpha$ constitute a neighborhood of the central path corresponding to the bundle of $w$-weighted affine-scaling trajectories for $w$ such that $\|w - e\| \le \alpha$. For $\alpha$ small, the bundle of trajectories ends in a compact neighborhood of the analytic center of the optimal face, contained in the relative interior of the face. Namely, the end points in the primal optimal face are the re-weighted centers given by

$$x^*(w) = \operatorname{argmin} \left\{ -\sum_{i \in B} w_i \log x_i \;\middle|\; A_B x_B = b \right\}.$$

Hence, the algorithm behaves as follows. As the optimal face is approached (and this happens in polynomial time), $x^k_N \to 0$, $s^k_B \to 0$ and $x^k_B$, $s^k_N$ remain in small neighborhoods of $x^*_B$ and $s^*_N$, the analytic centers of the primal and dual optimal faces.

Actually, it is always true that $x^k \to x^*$, $s^k \to s^*$, due to the results proved in Gonzaga and Tapia [3], which we describe.

As was stressed in the beginning of Section 6 of Gonzaga and Tapia [3], it is important to realize that our estimates do not require $(x^{0^{k+1}}, s^{0^{k+1}})$ to be related to $(x^{+^k}, s^{+^k})$, i.e., $(x^{0^k}, s^{0^k})$ does not have to be generated by the MTY algorithm. All that is required is that $(x^{0^k}, s^{0^k})$ satisfy the condition $\|w(x^{0^k}, s^{0^k}) - e\| \le \alpha$, for the appropriate choice of $\alpha$. Hence in what follows in this section and in Section 5 we employ this broad interpretation when discussing quantities generated by the MTY algorithm or the simplified MTY algorithm for only one iteration.

Proposition 4.1 Consider quantities generated by the MTY algorithm. Then

(i) $x_N = O(\mu)$, $s_B = O(\mu)$, $x^0_N = O(\mu^0)$, $s^0_B = O(\mu^0)$

(ii) $u^0 = O(\mu^0)$, $v^0 = O(\mu^0)$

(iii) $u_N = O(\mu)$, $v_B = O(\mu)$

Proof. See Lemma 5.1 of [3]. ■

The proposition above shows that the variations in $(x, s)$ due to either an MTY predictor or corrector step are bounded by $O(\mu^0)$, with the exception of $u_B$ and $v_N$. These are the variations in the large variables due to the corrector step.

The following proposition is the main result in Gonzaga and Tapia [3]. It is related to the map that associates to a pair $(x^0, s^0)$ the pair $(x^+, s^+)$ resulting from an MTY iteration. The proposition says that near the optimal face, an MTY iteration causes the large variables to approach the large variables of the analytic center $(x^*, s^*)$ of the optimal face. The proposition describes only the behaviour of the primal variables; the dual variables behave in a similar fashion, due to the symmetry of the optimality conditions (1).

The approach to the center is measured in the norm relative to $x^*_B$, defined for $h \in \mathbb{R}^n$ by $\|h_B\|_* = \|(x^*_B)^{-1} h_B\|$.

Proposition 4.2 Consider a sequence (not necessarily generated by the algorithm) $(x^{0^k}, s^{0^k})$ of primal-dual pairs such that $\delta(x^{0^k}, s^{0^k}) \le 0.1$ and $\mu^{0^k} \to 0$. Then there exists a sequence of positive reals $\{\epsilon_k\}$ such that $\epsilon_k \to 0$ and for sufficiently large $k$,

$$\|x^{+^k}_B - x^*_B\|_* \le \max\{\epsilon_k,\ 0.8\, \|x^{0^k}_B - x^*_B\|_*\}.$$

Proof. See Lemma 6.2 of [3]. ■

This result implies that the iterates approach $(x^*, s^*)$, and thus the sequence generated by the algorithm converges to the central optimum.

We are now concerned with bounding the sum of the variations (corrections) made to either the $x$-variable or the $s$-variable in either the predictor step or the corrector step in all iterations. The variation in $x$ due to a predictor step is $u^0$. By the total variation in $x$ due to predictor steps we mean $\sum_k \|u^{0^k}\|$. If we do not mention predictor steps or corrector steps we mean both steps. Analogous terminology is used for corresponding situations.

Proposition 4.3 Consider quantities $x^{0^k}$, $s^{0^k}$, $x^k$, etc. generated by the MTY algorithm starting at $(x^{0^1}, s^{0^1})$. Then

(i) $\sum_{k=1}^{\infty} \mu^{0^k} = O(\mu^{0^1})$.

(ii) The total variation in $x_N$ and in $s_B$ is bounded by $O(\mu^{0^1})$.

(iii) The total variation in $x_B$ and in $s_N$ due to predictor steps is bounded by $O(\mu^{0^1})$.


Proof.  To  prove  (i),  it  is  enough  to  show  that  for  some  constant  /?  £  (0,1), 
pk+ 1  <  /3pk.  This  was  shown  by  Mizuno,  Todd,  and  Ye  [10]  when  proving 
the  polynomiality  of  the  algorithm.  Now  (ii)  and  (iii)  are  direct  consequences 
of  Proposition  4.1,  completing  the  proof.  I 
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5  The  Simplified  Mizuno-Todd-Ye  Algorithm 


The simplified MTY algorithm is the MTY algorithm with the Newton corrector
step replaced by a simplified Newton step. This means that the computation
of the projections in (6) for the corrector step is reduced to a back
substitution, instead of a complete solution of the system.

We  now  state  the  complete  algorithm. 

Algorithm 5.1 Given $\alpha \le 0.1$ and feasible $(x^{0,1}, s^{0,1})$ such that
$\delta(x^{0,1}, s^{0,1}) \le \alpha/2$, set $k = 1$.

REPEAT

$x^0 := x^{0,k}$, $s^0 := s^{0,k}$, $\mu^0 := \mu(x^0, s^0)$.

Predictor: Given $(x^0, s^0)$ compute $(u^0, v^0)$, and let $x := x^0 + u^0$, $s := s^0 + v^0$,
where $(u^0, v^0)$ satisfies

\[ x^0 v^0 + s^0 u^0 = -(1 - \gamma)\,x^0 s^0, \qquad u^0 \in \mathcal{N}(A), \quad v^0 \in \mathcal{R}(A^T), \]

and $\gamma$ is as in the MTY predictor step.

Simplified Corrector: Given $(x, s)$, set $\mu := \mu(x, s)$. Compute $(u, v)$ satisfying

\[ x^0 v + s^0 u = -xs + \mu e, \qquad u \in \mathcal{N}(A), \quad v \in \mathcal{R}(A^T), \]

and set $x^+ := x + u$, $s^+ := s + v$.

Safeguard: If $\delta(x^+, s^+) > \alpha/2$, then discard $(x^+, s^+)$ and compute the
Newton corrector step

\[ x \hat v + s \hat u = -xs + \mu e, \qquad \hat u \in \mathcal{N}(A), \quad \hat v \in \mathcal{R}(A^T), \]

and set $x^+ := x + \hat u$, $s^+ := s + \hat v$.

Subsequent iterate: $x^{0,k+1} := x^+$, $s^{0,k+1} := s^+$, $k := k + 1$.

UNTIL convergence.
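
The two corrector systems differ only in the point at which the scaling is taken. As a hedged illustration, the following Python sketch solves both systems for a toy problem with a single equality constraint, so that the normal equations reduce to one scalar division. The data, the helper names `newton_direction`, `corrector_steps` and `gap`, and the definition $\mu(x,s) = x^T s / n$ are assumptions made for this example; the toy points are feasible but are not claimed to lie in the neighborhood used by the algorithm.

```python
def newton_direction(a, x_scale, s_scale, rhs):
    """Solve x_scale*v + s_scale*u = rhs with a.u = 0 and v = y*a.

    `a` is the single row of A, so N(A) = {u : a.u = 0} and R(A^T) = span{a}.
    """
    # Scalar normal "matrix" A D^2 A^T with D^2 = x_scale / s_scale.
    normal = sum(ai * ai * xi / si for ai, xi, si in zip(a, x_scale, s_scale))
    y = sum(ai * ri / si for ai, ri, si in zip(a, rhs, s_scale)) / normal
    v = [ai * y for ai in a]
    u = [(ri - xi * vi) / si for ri, xi, vi, si in zip(rhs, x_scale, v, s_scale)]
    return u, v


def corrector_steps(a, x0, s0, x, s):
    """Return the simplified and the exact corrector directions at (x, s)."""
    n = len(x)
    mu = sum(xi * si for xi, si in zip(x, s)) / n   # assumed mu(x, s) = x^T s / n
    rhs = [mu - xi * si for xi, si in zip(x, s)]    # right-hand side -xs + mu*e
    simplified = newton_direction(a, x0, s0, rhs)   # scaling frozen at (x0, s0)
    exact = newton_direction(a, x, s, rhs)          # scaling refreshed at (x, s)
    return simplified, exact


def gap(x, s):
    """Duality gap x^T s."""
    return sum(xi * si for xi, si in zip(x, s))


# Toy data: A = [1 1 1]; x - x0 lies in N(A) and s - s0 lies in R(A^T).
a = [1.0, 1.0, 1.0]
x0, s0 = [0.5, 0.3, 0.2], [1.0, 1.5, 2.5]
x, s = [0.6, 0.25, 0.15], [0.8, 1.3, 2.3]

(u, v), (u_hat, v_hat) = corrector_steps(a, x0, s0, x, s)
x_simpl = [xi + ui for xi, ui in zip(x, u)]
s_simpl = [si + vi for si, vi in zip(s, v)]
x_exact = [xi + ui for xi, ui in zip(x, u_hat)]
s_exact = [si + vi for si, vi in zip(s, v_hat)]
print(gap(x, s), gap(x_simpl, s_simpl), gap(x_exact, s_exact))
```

Both corrector variants should leave the duality gap unchanged up to rounding. In an actual implementation the factorization for the frozen scaling would be reused from the predictor step rather than recomputed, which is where the saving of one linear solve comes from; the sketch recomputes it only for brevity.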

Before  we  formally  state  the  convergence  properties  that  we  have  derived 
for  the  simplified  predictor-corrector  algorithm,  there  is  value  in  collecting 
some fundamental observations. In what follows all quantities should be
indexed by $k$; however, as we have been doing above, we will not always write
the index $k$.




Proposition 5.2 Let $\{(x^0,s^0)^k, (x,s)^k, (x^+,s^+)^k\}$ be generated by the
simplified MTY predictor-corrector algorithm. Then

(i) $x^{+T} s^+ = x^T s$

(ii) $x^T s = \gamma\, x^{0T} s^0$

(iii) $\gamma = O(x^{0T} s^0)$

(iv) $x^T s \le (1 - \delta/\sqrt{n})\, x^{0T} s^0$ for some $\delta > 0$ that does not depend on $k$.

Proof. The proof of (i) follows from Proposition 2.2 applied at $(x, s)$ with
$\gamma = 0$ and $\mu = \mu(x, s)$. The proof of (ii) follows from Proposition 2.2 with
both pairs of that proposition set to $(x^0, s^0)$, its $\gamma$ equal to our $\gamma$, and $\mu = 0$.
Both (iii) and (iv) follow from Theorem 4.1 of Ye, Güler, Tapia, and Zhang [15], once we observe that
their $\beta$ is related to our $\alpha$ by the relationship $\beta = \alpha/2$ and their steplength $\theta$
is related to our $\gamma$ by the relationship $\theta = 1 - \gamma$. ■
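
Items (i) and (ii) can also be checked directly from the step equations of Algorithm 5.1. The short derivation below is a sketch under two assumptions: $\mu(x,s) = x^T s / n$, and $u^T v = 0$ whenever $u \in \mathcal{N}(A)$ and $v \in \mathcal{R}(A^T)$. It is not a restatement of Proposition 2.2.

```latex
% (ii): sum the components of the predictor equation
x^{0T} v^0 + s^{0T} u^0 = -(1-\gamma)\, x^{0T} s^0, \qquad u^{0T} v^0 = 0,
\quad\Longrightarrow\quad
x^T s = (x^0 + u^0)^T (s^0 + v^0) = \gamma\, x^{0T} s^0 .

% (i): sum the components of the simplified corrector equation
x^{0T} v + s^{0T} u = -x^T s + n\mu = 0,
\qquad
u^{0T} v = v^{0T} u = u^T v = 0,
\quad\Longrightarrow\quad
x^{+T} s^+ = x^T s + (x^0 + u^0)^T v + (s^0 + v^0)^T u + u^T v = x^T s .
```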

The algorithm uses a simplified Newton iteration in the corrector step.
If the simplified corrector produces the reduction in the proximity $\delta$ that
ensures the quadratic convergence of the algorithm, i.e., if $\delta(x^+, s^+) \le \alpha/2$,
then the step is accepted. Otherwise the simplified step is discarded and the
algorithm performs a Newton corrector step.
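
The acceptance test can be sketched in a few lines of Python. The definitions $\mu(x,s) = x^T s/n$ and $\delta(x,s) = \|xs/\mu - e\|$ are assumed here (they match the way $\delta$ is used in this section), and the function names are illustrative only.

```python
import math


def mu(x, s):
    """Average complementarity x^T s / n (assumed definition of mu(x, s))."""
    return sum(xi * si for xi, si in zip(x, s)) / len(x)


def proximity(x, s):
    """delta(x, s) = || x*s/mu - e ||_2, with x*s the componentwise product."""
    m = mu(x, s)
    return math.sqrt(sum((xi * si / m - 1.0) ** 2 for xi, si in zip(x, s)))


def safeguard_triggered(x_plus, s_plus, alpha):
    """True when the simplified corrector result must be discarded."""
    return proximity(x_plus, s_plus) > alpha / 2.0


# A perfectly centered pair passes; a pair with w = (1.1, 0.9) does not.
print(safeguard_triggered([1.0, 2.0], [2.0, 1.0], alpha=0.1))
print(safeguard_triggered([1.0, 1.0], [1.1, 0.9], alpha=0.1))
```

When the test fails, the algorithm falls back to the exact Newton corrector, so the cost of a rejected simplified step is one extra back substitution.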

Two  things  must  be  proved:  first  that  the  iterates  are  still  convergent, 
not  necessarily  to  the  analytic  center  of  the  optimal  face,  and  second,  that 
the  safeguard  cannot  be  activated  more  than  a  finite  number  of  times. 

The  predictor  step  is  the  same  as  that  for  the  MTY  algorithm.  Our 
analysis  will  be  based  on  a  comparison  of  the  simplified  and  exact  corrector 
steps.  The  conclusions  will  be  the  following:  For  points  near  the  optimal  face 

(i) The simplified corrector step does not center the large variables. The
variation in $x_B$ and $s_N$ due to simplified steps will be bounded by $O(\mu^0)$.

(ii) The behaviour of the small variables $x_N$ and $s_B$ tends to be identical
in both methods.

These  two  facts  will  be  proved  and  then  used  to  contradict  the  hypothesis 
that  the  safeguard  is  activated  an  infinite  number  of  times. 

We  begin  by  studying  the  behaviour  of  the  large  variables. 
Proposition 5.3 Consider the corrector directions $(u^k, v^k)$ and $(\hat u^k, \hat v^k)$
generated at iteration $k$ of Algorithm 5.1 (independently of which one is actually
accepted by the algorithm). Then there exist a number $K > 0$ and sequences
$\{\theta_x^k\}$, $\{\theta_s^k\}$ in $\mathbb{R}_+$ such that $\theta_x^k \to 0$, $\theta_s^k \to 0$ and

\[ \|u_B^k\| \le \gamma^k K\,(\|\hat u_B^k\| + \theta_x^k), \qquad \|v_N^k\| \le \gamma^k K\,(\|\hat v_N^k\| + \theta_s^k). \]

Hence $u_B^k = O(\mu^{0,k})$ and $v_N^k = O(\mu^{0,k})$.

Proof. We shall prove the result for $u_B^k$. The proof of the other result is
similar.

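Dropping the index $k$ for notational simplicity, the primal directions are
computed from (6):

\[ u = x^0\phi^0 P_{Ax^0\phi^0}\,\phi^0\left(-\frac{xs}{\mu^0} + \frac{\mu}{\mu^0}\,e\right), \qquad \hat u = x\phi P_{Ax\phi}\,\phi\left(-\frac{xs}{\mu} + e\right). \]

Substituting $\mu = \gamma\mu^0$, we obtain for $p = -\frac{xs}{\mu} + e$:

\[ \frac{u}{\gamma} = x^0\phi^0 P_{Ax^0\phi^0}\,\phi^0 p, \qquad \hat u = x\phi P_{Ax\phi}\,\phi p. \]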

The points $x^k$ and $x^{0,k}$ approach the relative interior of the optimal face,
converging to a small compact neighborhood of the central optimum $x^*$. The
vectors $\phi$ and $\phi^0$ have the following bounds.

By construction, $w_i^0 \in [0.95, 1.05]$, $w_i \in [0.9, 1.1]$. Since $\phi_i = 1/\sqrt{w_i}$ by
definition, the following bounds can be easily checked:

\[ \phi_i^0 \in [0.97, 1.03], \qquad \phi_i \in [0.95, 1.06], \qquad \frac{\phi_i^0}{\phi_i} \in [0.92, 1.08]. \tag{14} \]
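
The bounds in (14) follow from evaluating $1/\sqrt{w}$ at the endpoints of the intervals for $w^0$ and $w$. A small Python check, with the helper name `phi_range` introduced only for this illustration:

```python
import math


def phi_range(w_lo, w_hi):
    """Range of phi = 1/sqrt(w) for w in [w_lo, w_hi]; phi decreases in w."""
    return 1.0 / math.sqrt(w_hi), 1.0 / math.sqrt(w_lo)


phi0_lo, phi0_hi = phi_range(0.95, 1.05)   # w^0 in [0.95, 1.05]
phi_lo, phi_hi = phi_range(0.90, 1.10)     # w   in [0.9, 1.1]

# Extreme values of the ratio phi^0 / phi over the two intervals.
ratio_lo, ratio_hi = phi0_lo / phi_hi, phi0_hi / phi_lo

print(round(phi0_lo, 4), round(phi0_hi, 4))
print(round(phi_lo, 4), round(phi_hi, 4))
print(round(ratio_lo, 4), round(ratio_hi, 4))
```

Each printed pair should fall inside the corresponding interval quoted in (14), which rounds the endpoints outward.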

Thus $x^0\phi^0$ and $x\phi$ also converge to compact sets. Since $\|p\| = \delta(x,s) \le 0.1$,
the vectors $\phi p$ and $\phi^0 p$ are also in compact sets, and we can use
Proposition 3.2 to obtain

\[ \frac{u_B}{\gamma} - x_B^0\phi_B^0 P_{A_B x_B^0\phi_B^0}\,\phi_B^0 p_B \to 0, \qquad \hat u_B - x_B\phi_B P_{A_B x_B\phi_B}\,\phi_B p_B \to 0. \tag{15} \]

The scaled projections above are almost in the format required by Proposition
3.3, on slightly shifted scalings. To put them in the desired format, let us
write

\[ p_B = x_B^0 (x_B^0)^{-1} p_B. \]

Due to Proposition 4.1, since $x_B = \Omega(1)$, we have

\[ x_B = x_B^0 + u_B^0 = x_B^0\,(e + O(\mu^0)). \tag{16} \]

It follows that $(x_B^0)^{-1} = x_B^{-1}(e + O(\mu^0))$. Thus

\[ p_B = x_B^0 x_B^{-1} p_B\,(e + O(\mu^0)) = x_B^0 x_B^{-1} p_B + O(\mu^0). \]

Since $O(\mu^0) \to 0$, (15) can be written as

\[ \frac{u_B}{\gamma} - x_B^0\phi_B^0 P_{A_B x_B^0\phi_B^0}\; x_B^0\phi_B^0\, x_B^{-1} p_B \to 0, \tag{17} \]

\[ \hat u_B - x_B\phi_B P_{A_B x_B\phi_B}\; x_B\phi_B\, x_B^{-1} p_B \to 0. \tag{18} \]


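Defining $q = \dfrac{x_B^0\phi_B^0}{x_B\phi_B}$, we see from (14) and (16) that for $\mu$ sufficiently small,
$q_i \in [0.9, 1.1]$, and thus $\|q - e\|_\infty \le 0.1$. Now (17) can be written as

\[ \frac{u_B}{\gamma} - x_B\phi_B\, q\, P_{A_B x_B\phi_B q}\; x_B\phi_B\, q\, x_B^{-1} p_B \to 0. \tag{19} \]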


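Defining $h_B = q\, P_{A_B x_B\phi_B q}\; x_B\phi_B\, q\, x_B^{-1} p_B$ and $\hat h_B = P_{A_B x_B\phi_B}\; x_B\phi_B\, x_B^{-1} p_B$, we
see from Proposition 3.3 that

\[ \|h_B - \hat h_B\| \le 0.3\,\|\hat h_B\|. \]

Dividing (18) by $x_B\phi_B$, and using scaled norms, it follows that

\[ \|\hat u_B\|_{x_B\phi_B} - \|\hat h_B\| \to 0. \tag{20} \]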

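Subtracting (18) from (19) establishes that

\[ \frac{\frac{u_B}{\gamma} - \hat u_B}{x_B\phi_B} + \hat h_B - h_B \to 0, \tag{21} \]

or (making the iteration indices explicit),

\[ \left\| \frac{u_B^k}{\gamma^k} - \hat u_B^k \right\|_{x_B^k\phi_B^k} \le \|h_B^k - \hat h_B^k\| + \sigma_1^k \le 0.3\,\|\hat h_B^k\| + \sigma_1^k \le 0.3\,\|\hat u_B^k\|_{x_B^k\phi_B^k} + \sigma_2^k, \]

where the last inequality comes from (20), with $\sigma_1^k \to 0$ and $\sigma_2^k \to 0$.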

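Using Proposition 3.4 twice to relate $\|\cdot\|_{x_B^k\phi_B^k}$ and $\|\cdot\|$, we see that there
exists a constant $K_1 > 0$ such that

\[ \left\| \frac{u_B^k}{\gamma^k} - \hat u_B^k \right\| \le K_1\left(\|\hat u_B^k\| + \theta_x^k\right), \]

where $\theta_x^k \to 0$. Finally, by the triangle inequality,

\[ \frac{\|u_B^k\|}{\gamma^k} \le \|\hat u_B^k\| + K_1\left(\|\hat u_B^k\| + \theta_x^k\right), \]

which gives the stated bound with $K = 1 + K_1$.

The final statement follows from the fact that $\{\hat u_B^k\}$ and $\{\hat v_N^k\}$ are bounded,
and $\gamma = O(\mu^0)$ from (iii) of Proposition 5.2.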

■ 


Proposition 5.4 Consider the quantities $x^{0,k}$, $s^{0,k}$, $x^k$, $s^k$, etc. generated by
Algorithm 5.1, starting at $(x^{0,1}, s^{0,1})$. Then

(i) The total variation in $(x, s)$ due to simplified Newton steps is bounded
by $O(\mu^{0,1})$.

(ii) The sequences $\{(x^{0,k}, s^{0,k})\}$ and $\{(x^k, s^k)\}$ converge to a pair $(\bar x, \bar s)$ in
the optimal face.

If the safeguard is activated an infinite number of times*, then $(\bar x, \bar s) = (x^*, s^*)$,
the central optimal pair. Otherwise $(\bar x, \bar s)$ is not necessarily equal to $(x^*, s^*)$.

*We shall prove below that this hypothesis is vacuous, but it will be needed to establish
a contradiction.

Proof. (i): Recall that $\mu(x, s) = \gamma\mu(x^0, s^0)$ and apply Proposition 2.1
with $\mu = \gamma\mu^0$ to obtain

\[ u = \gamma\, x^0\phi^0 P_{Ax^0\phi^0}\,\phi^0 p \quad\text{and}\quad v = \gamma\, s^0\phi^0 \tilde P_{Ax^0\phi^0}\,\phi^0 p, \]

where $p = -\frac{xs}{\mu} + e$ and $\tilde P_{Ax^0\phi^0} = I - P_{Ax^0\phi^0}$ is the complementary projection. Since $\delta(x, s) \le \frac{1}{2}$, we see that

\[ \|p\| = \delta(x, s) \le \tfrac{1}{2}. \]

Moreover, since $\delta(x^0, s^0) = \|w(x^0, s^0) - e\| \le \frac{1}{2}$, we see that the components of
$w(x^0, s^0)$ are contained in $[\frac{1}{2}, \frac{3}{2}]$; hence the components of $\phi^0$ are contained in
$[\sqrt{2/3}, \sqrt{2}]$. Also the sequence $\{(x^{0,k}, s^{0,k})\}$ is bounded, and projection operators
are bounded. It follows from the above expressions and the fact that all
quantities are bounded that $u = O(\gamma) = O(\mu^0)$ and $v = O(\gamma) = O(\mu^0)$.

(ii):  If  the  safeguard  is  activated  a  finite  number  of  times,  the  conclusion 
follows  from  (i),  because  then  the  sequences  generated  by  the  algorithm  are 
Cauchy  sequences.  Otherwise,  the  convergence  proof  is  similar  to  the  proof 
for  the  MTY  algorithm,  presented  in  Gonzaga  and  Tapia  [3]. 

We shall prove the result for the primal variables. The proof for the
dual slacks is similar. Also, it is enough to prove that $x^{0,k} \to x^*$, since
$u^{0,k} = O(\mu^{0,k}) \to 0$.

Assume by contradiction that the sequence $\{x^{0,k}\}$ has an accumulation
point $\bar x \ne x^*$. Since $\bar x_N = x_N^* = 0$, we have

\[ \sigma = \|\bar x_B - x_B^*\|_* > 0. \]


Let $\mathcal{K} \subset \mathbb{N}$ be the set of iterations in which the safeguard is activated (MTY
iterations). Our first step is to show that $\bar x$ must also be an accumulation
point of $(x^{0,k})_{k \in \mathcal{K}}$.

Let $\mathcal{K}_1 \subset \mathbb{N}$ be a subsequence such that $x^{0,k} \to \bar x$ along $\mathcal{K}_1$, and let $j(k)$ be the first
index in $\mathcal{K}$ greater than or equal to $k$. Then for any $k \in \mathcal{K}_1$, $\|x^{0,j(k)} - x^{0,k}\| =
O(\mu^{0,k})$ by (i), and thus $x^{0,j(k)} \to \bar x$ along $\mathcal{K}_1$. Thus it is enough to consider in our
assumption subsequences in $\mathcal{K}$.

Let $\{\epsilon_k\}$ be the sequence given by Proposition 4.2, and let $\bar k$ be such that
for $k \ge \bar k$ the conclusions of that proposition are valid and $\epsilon_k \le 0.5\sigma$.

Choose an index $j \ge \bar k$ with the following characteristics: $j \in \mathcal{K}$,
$\|x_B^{0,j} - x_B^*\|_* \le 1.1\sigma$, and the total variation of $x$ due to simplified steps after $j$
satisfies

\[ \sum_{k \notin \mathcal{K},\ k > j} \|x_B^{0,k+1} - x_B^{0,k}\|_* \le 0.05\sigma. \tag{22} \]

Such an index exists by definition of $\sigma$ and by (i). We shall prove by induction
that for $k \in \mathcal{K}$, $k > j$, $\|x_B^{0,k} - x_B^*\|_* \le 0.95\sigma$.

(a) $\|x_B^{0,j+1} - x_B^*\|_* \le 0.8 \times 1.1\sigma < 0.9\sigma$ by Proposition 4.2. Let $k' = j(j+1)$
be the next index in $\mathcal{K}$. Using (22),

\[ \|x_B^{0,k'} - x_B^*\|_* \le \|x_B^{0,j+1} - x_B^*\|_* + \|x_B^{0,k'} - x_B^{0,j+1}\|_* \le 0.95\sigma. \]

(b) Assume that for an index $k \in \mathcal{K}$, $k > j$, $\|x_B^{0,k} - x_B^*\|_* \le 0.95\sigma$. Then
by Proposition 4.2,

\[ \|x_B^{0,k+1} - x_B^*\|_* \le \max\{\epsilon_k,\ 0.8\,\|x_B^{0,k} - x_B^*\|_*\} \le 0.9\sigma. \]

As in (a), using (22), let $k' = j(k+1)$ be the next index in $\mathcal{K}$:

\[ \|x_B^{0,k'} - x_B^*\|_* \le \|x_B^{0,k+1} - x_B^*\|_* + \|x_B^{0,k'} - x_B^{0,k+1}\|_* \le 0.95\sigma. \]

(a) and (b) prove that for all $k \in \mathcal{K}$, $k > j$, $\|x_B^{0,k} - x_B^*\|_* \le 0.95\sigma$,
contradicting the fact that $\sigma$ is an accumulation point of the sequence
$(\|x_B^{0,k} - x_B^*\|_*)_{k \in \mathcal{K}}$, and completing the proof. ■

Having  described  the  behaviour  of  the  large  variables,  we  can  now  com¬ 
pare  the  small  variables  for  the  exact  and  simplified  Newton  corrector  steps. 

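At a typical iteration, the simplified step $(u, v)$ and the exact step $(\hat u, \hat v)$
satisfy the equations below:

\[ \begin{aligned} x_B^0 v_B + s_B^0 u_B &= -x_B s_B + \mu e_B, \\ x_N^0 v_N + s_N^0 u_N &= -x_N s_N + \mu e_N, \end{aligned} \tag{23} \]

\[ \begin{aligned} x_B \hat v_B + s_B \hat u_B &= -x_B s_B + \mu e_B, \\ x_N \hat v_N + s_N \hat u_N &= -x_N s_N + \mu e_N, \end{aligned} \tag{24} \]

where $\mu = \gamma\mu^0$, $\gamma = O(\mu^0)$.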


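Before we state the main result, we establish some relationships within a
typical iteration:

(i) (Large variables) Since $u^0 = O(\mu^0)$, $v^0 = O(\mu^0)$ and all components
of $x_B$ and $s_N$ are bounded away from zero,

\[ x_B^0 = x_B\,(e + O(\mu^0)), \qquad s_N^0 = s_N\,(e + O(\mu^0)). \tag{25} \]

(ii) (Small variables) By construction,

\[ x^0 s^0 = \mu^0 w^0, \qquad xs = \mu w = \gamma\mu^0 w, \]

where $w_i^0 \in [0.95, 1.05]$, $w_i \in [0.9, 1.1]$, $i = 1, \ldots, n$. Dividing these expressions,

\[ \frac{x_N^0}{x_N} = \frac{1}{\gamma}\,\frac{s_N}{s_N^0}\,\frac{w_N^0}{w_N}, \qquad \frac{s_B^0}{s_B} = \frac{1}{\gamma}\,\frac{x_B}{x_B^0}\,\frac{w_B^0}{w_B}. \]

From (25), it is immediate that $s_N/s_N^0 = e + O(\mu^0)$ and $x_B/x_B^0 = e + O(\mu^0)$.
By a simple calculation, $w_i^0/w_i \in [0.85, 1.17]$, $i = 1, \ldots, n$.

Defining

\[ \sigma_N = \frac{s_N}{s_N^0}\,\frac{w_N^0}{w_N}, \qquad \sigma_B = \frac{x_B}{x_B^0}\,\frac{w_B^0}{w_B}, \]

it follows that for sufficiently small $\mu^0$, $\sigma_i \in [0.8, 1.2]$, and we can write

\[ x_N^0 = \frac{1}{\gamma}\,\sigma_N x_N, \qquad s_B^0 = \frac{1}{\gamma}\,\sigma_B s_B. \tag{26} \]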
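
The numerical intervals above come from the endpoint values of $w^0$ and $w$. A short worked version of the arithmetic, under the stated ranges $w_i^0 \in [0.95, 1.05]$ and $w_i \in [0.9, 1.1]$:

```latex
\frac{w_i^0}{w_i} \in \left[\frac{0.95}{1.1},\ \frac{1.05}{0.9}\right]
  \approx [0.8636,\ 1.1667] \subset [0.85,\ 1.17],
\qquad
\sigma_i = \bigl(1 + O(\mu^0)\bigr)\,\frac{w_i^0}{w_i} \in [0.8,\ 1.2]
  \quad \text{for } \mu^0 \text{ sufficiently small}.
```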

Proposition  5.5  Consider  an  application  of  Algorithm  5.1.  Then  the  safe¬ 
guard  cannot  be  activated  an  infinite  number  of  times. 

Proof. Assume by contradiction that the safeguard is activated at the
iterations with indices in an infinite set $\mathcal{K}$.

From Proposition 5.4, the sequences $(x^0, s^0)^k$ and $(x, s)^k$ converge to the
analytic center $(x^*, s^*)$ of the optimal face. It follows that

\[ \hat u^k \to 0, \qquad \hat v^k \to 0. \tag{27} \]

Let us substitute the relations (25) and (26) into the Newton equations (23).
We shall analyse the first equation (indices in $B$); the analysis for the other
one is similar. Our approach is to compare the behaviour of the small
variables in the simplified and exact corrector steps. To begin with,

\[ (e + O(\mu^0))\,x_B v_B + \frac{1}{\gamma}\,\sigma_B s_B u_B = -x_B s_B + \mu e_B. \tag{28} \]

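Subtracting (24) from (28), and restoring the iteration indices,

\[ \bigl((e + O(\mu^{0,k}))\,v_B^k - \hat v_B^k\bigr)\,x_B^k = -\Bigl(\frac{1}{\gamma^k}\,\sigma_B^k u_B^k - \hat u_B^k\Bigr)\,s_B^k. \]

Taking norms,

\[ \bigl\|\bigl((e + O(\mu^{0,k}))\,v_B^k - \hat v_B^k\bigr)\,x_B^k\bigr\| \le \|s_B^k\|_\infty \Bigl(\|\sigma_B^k\|_\infty\,\frac{\|u_B^k\|}{\gamma^k} + \|\hat u_B^k\|\Bigr). \]

From Proposition 5.3, $\|u_B^k\|/\gamma^k \le K(\|\hat u_B^k\| + \theta_x^k)$, where $\theta_x^k \to 0$. Since
$\|\sigma_B^k\|_\infty \le 1.2$ for sufficiently large $k$, and $\|s_B^k\|_\infty = O(\mu^k)$ by Proposition 4.1,
the inequality becomes

\[ \bigl\|\bigl((e + O(\mu^{0,k}))\,v_B^k - \hat v_B^k\bigr)\,x_B^k\bigr\| \le O(\mu^k)\bigl(1.2K(\|\hat u_B^k\| + \theta_x^k) + \|\hat u_B^k\|\bigr) \le K_1\mu^k\bigl(\|\hat u_B^k\| + \theta_x^k\bigr), \]

where $K_1$ is a constant that depends on the problem data. Since $\hat u_B^k \to 0$ by
(27), and since $x_B^k$ has all components bounded away from zero, we conclude
that

\[ \frac{(e + O(\mu^{0,k}))\,v_B^k - \hat v_B^k}{\mu^k} \to 0, \]

and since $\mu^k \to 0$,

\[ \frac{v_B^k - \hat v_B^k}{\mu^k} \to 0, \qquad \frac{u_N^k - \hat u_N^k}{\mu^k} \to 0. \tag{29} \]

The second expression is obtained by a similar process, using the second
equation in (23).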

Now we shall establish a contradiction. At a typical iteration, let

\[ w^+ = \frac{(x + u)(s + v)}{\mu}, \qquad \hat w = \frac{(x + \hat u)(s + \hat v)}{\mu}. \]

From the analysis of the MTY algorithm presented in Section 4, we see that

\[ \|\hat w - e\| \le \frac{\alpha^2}{\sqrt{2}} \le 0.01. \]
At any iteration $k \in \mathcal{K}$,

\[ \|w^+ - e\| > \frac{\alpha}{2} \ge 0.05. \]

At such an iteration, either $\|w_N^+ - e_N\| > 0.02$ or $\|w_B^+ - e_B\| > 0.02$. Assume
that at an infinite number of iterations $\mathcal{K}_1 \subset \mathcal{K}$, $\|w_N^+ - e_N\| > 0.02$ (the
analysis for the other case is completely similar).

Then for $k \in \mathcal{K}_1$,

\[ \|w_N^+ - e_N\| > 0.02, \qquad \|\hat w_N - e_N\| \le 0.01. \]

This implies that in these iterations

\[ \|w_N^+ - \hat w_N\| = \|(w_N^+ - e_N) - (\hat w_N - e_N)\| > 0.01. \tag{30} \]

On the other hand, we have by definition,

\[ \mu w_N^+ = (x_N + u_N)(s_N + v_N), \qquad \mu \hat w_N = (x_N + \hat u_N)(s_N + \hat v_N). \]

Subtracting,

\[ \mu\,(w_N^+ - \hat w_N) = (x_N + u_N)(s_N + v_N) - (x_N + \hat u_N)(s_N + \hat v_N). \]

Reordering terms in this expression, we obtain

\[ w_N^+ - \hat w_N = \frac{u_N - \hat u_N}{\mu}\,(s_N + v_N) + \frac{x_N + \hat u_N}{\mu}\,(v_N - \hat v_N). \]
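
The reordering is an add-and-subtract step. Writing hats for the quantities of the exact Newton corrector and plain symbols for the simplified one, and inserting the mixed product $(x_N + \hat u_N)(s_N + v_N)$, one possible way to see it is:

```latex
\begin{aligned}
\mu\,(w_N^+ - \hat w_N)
 &= (x_N + u_N)(s_N + v_N) - (x_N + \hat u_N)(s_N + v_N) \\
 &\quad + (x_N + \hat u_N)(s_N + v_N) - (x_N + \hat u_N)(s_N + \hat v_N) \\
 &= (u_N - \hat u_N)(s_N + v_N) + (x_N + \hat u_N)(v_N - \hat v_N).
\end{aligned}
```

Dividing by $\mu$ gives the two terms that are analysed separately in the remainder of the proof.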

Let us analyse the terms in the right-hand side (restoring the index $k$):

(i) By (29),

\[ \left\| \frac{u_N^k - \hat u_N^k}{\mu^k}\,(s_N^k + v_N^k) \right\| \to 0. \]

(ii) By Proposition 4.1, $x_N^k = O(\mu^k)$ and $\hat u_N^k = O(\mu^k)$.
From (27), $\hat v_N^k \to 0$. From Proposition 5.3, $v_N^k \to 0$. Hence

\[ \left\| \frac{x_N^k + \hat u_N^k}{\mu^k}\,(v_N^k - \hat v_N^k) \right\| \le K_2\,\|v_N^k - \hat v_N^k\|, \]

where $K_2$ depends on problem data, and so this term converges to zero.

We conclude that $(w_N^+ - \hat w_N)^k \to 0$, contradicting (30), and completing
the proof. ■

We  are  now  ready  to  formally  state  our  convergence  results. 


Theorem 5.1 Let $\{(x^0, s^0)^k\}$ and $\{(x, s)^k\}$ denote the sequences generated
by the simplified MTY predictor-corrector algorithm. Then

(i) The safeguard in the corrector step is activated only a finite number of
times.

(ii) The algorithm has iteration complexity $O(\sqrt{n}L)$.

(iii) The duality-gap sequence $\{x^{0T} s^0\}$ converges quadratically to zero.

(iv) Both sequences $\{(x^0, s^0)\}$ and $\{(x, s)\}$ converge to a point $(\bar x, \bar s)$ in the
optimal face.


Proof.  Property  (i)  follows  from  Proposition  5.5.  Also  (ii)  follows  from 
(iv)  of  Proposition  5.2  in  a  standard  manner.  See  Mizuno,  Todd,  and  Ye  [10] 
for  details.  Property  (iii)  is  a  combination  of  (i),  (ii),  and  (iii)  of  Proposition 
5.2.  Finally  (iv)  is  (ii)  of  Proposition  5.4.  ■ 


6  Concluding  Remarks 

The  fact  that  so  much  of  Theorem  5.1  follows  from  Proposition  5.2  and 
Proposition  5.2  depends  so  little  on  the  corrector  step  leads  us  to  take  a 
closer  look  at  the  role  of  the  corrector  step  in  our  convergence  theory. 

Consider a typical simplified MTY predictor-corrector iteration represented
by $\{(x^0, s^0), (x, s), (x^+, s^+)\}$. The predictor step takes $(x^0, s^0)$ to $(x, s)$
and the corrector step takes $(x, s)$ to $(x^+, s^+)$. A close look at the derivation
of our theory shows that for the establishment of $O(\sqrt{n}L)$ complexity and
quadratic convergence we only used the fact that the corrector step satisfies

\[ \text{(i)}\ \ x^{+T} s^+ \le x^T s \qquad \text{and} \qquad \text{(ii)}\ \ \delta(x^+, s^+) \le \alpha/2. \tag{31} \]

Hence any corrector step satisfying (31) will lead to $O(\sqrt{n}L)$ complexity and
quadratic convergence, but not necessarily iteration sequence convergence.
It follows that quadratic convergence is the best that should be expected
from either the MTY algorithm or the simplified MTY predictor-corrector
algorithm. This is because for both these algorithms the corrector step does
not improve the duality-gap, i.e. $x^{+T} s^+ = x^T s$, and therefore the quadratic
decrease is obtained entirely from the damped Newton predictor step, and
quadratic decrease (in general) is optimal for a (damped) Newton method.
Clearly the same is true for any corrector step that does not decrease the
duality-gap.

We are accustomed to expect cubic decrease from the pair consisting of a
Newton step and a simplified Newton step and quartic decrease from the pair
consisting of two Newton steps. In order to attain these objectives along with
$O(\sqrt{n}L)$ complexity the predictor-corrector approach will have to be modified
so that the corrector step still satisfies (31) but also gives the appropriate
decrease in the duality-gap. For example if in the simplified corrector step
of Algorithm 5.1 we replace $\mu$ with $\gamma\mu$ and the safeguard is activated only
a finite number of times, then we would obtain cubic convergence from the
simplified MTY algorithm. We did not pursue this issue in the present work.

The  contribution  of  this  paper  is  the  demonstration  that  in  the  MTY 
predictor-corrector  algorithm  the  Newton  corrector  step  can  be  replaced  with 
a  safeguarded  simplified  Newton  corrector  step  and  all  the  algorithmic  prop¬ 
erties  are  maintained,  except  that  the  convergence  of  the  iteration  sequence 
is  no  longer  to  the  analytic  center.  Whether  this  loss  is  important  or  not 
clearly  depends  on  the  application. 